 perception map


Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding

Mao, Shunqi, Zhang, Chaoyi, Cai, Weidong

arXiv.org Artificial Intelligence

Existing vision-language models (VLMs) often suffer from visual hallucination, where the generated responses contain inaccuracies that are not grounded in the visual input. Efforts to address this issue without model finetuning primarily mitigate hallucination by reducing biases contrastively or amplifying the weights of visual embedding during decoding. However, these approaches improve visual perception at the cost of impairing the language reasoning capability. In this work, we propose the Perception Magnifier (PM), a novel visual decoding method that iteratively isolates relevant visual tokens based on attention and magnifies the corresponding regions, spurring the model to concentrate on fine-grained visual details during decoding. Specifically, by magnifying critical regions while preserving the structural and contextual information at each decoding step, PM allows the VLM to enhance its scrutiny of the visual input, hence producing more accurate and faithful responses. Extensive experimental results demonstrate that PM not only achieves superior hallucination mitigation but also enhances language generation while preserving strong reasoning capabilities. Code is available at https://github.com/ShunqiM/PM.


Safe Perception-Based Control under Stochastic Sensor Uncertainty using Conformal Prediction

Yang, Shuo, Pappas, George J., Mangharam, Rahul, Lindemann, Lars

arXiv.org Artificial Intelligence

We consider perception-based control using state estimates that are obtained from high-dimensional sensor measurements via learning-enabled perception maps. However, these perception maps are not perfect and result in state estimation errors that can lead to unsafe system behavior. Stochastic sensor noise can make matters worse and result in estimation errors that follow unknown distributions. We propose a perception-based control framework that i) quantifies estimation uncertainty of perception maps, and ii) integrates these uncertainty representations into the control design. To do so, we use conformal prediction to compute valid state estimation regions, which are sets that contain the unknown state with high probability. We then devise a sampled-data controller for continuous-time systems based on the notion of measurement robust control barrier functions. Our controller uses ideas from self-triggered control and enables us to avoid using stochastic calculus. Our framework is agnostic to the choice of the perception map, independent of the noise distribution, and to the best of our knowledge the first to provide probabilistic safety guarantees in such a setting. We demonstrate the effectiveness of our proposed perception-based controller for a LiDAR-enabled F1/10th car.
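
As a rough illustration of the conformal prediction step mentioned in this abstract, the sketch below computes a split-conformal radius from calibration errors. It is a generic textbook construction, not the authors' implementation: the function name `conformal_radius`, the use of a scalar estimation-error norm as the nonconformity score, and the sample values are all assumptions made for the example.

```python
import math


def conformal_radius(scores, delta):
    """Split-conformal radius (hypothetical helper, not from the paper).

    scores: calibration nonconformity scores, e.g. norms of the error
            between the true state and the perception-map estimate.
    delta:  allowed miscoverage, so the target coverage is 1 - delta.

    Under exchangeability of calibration and test scores, a new score
    is at most the returned radius with probability at least 1 - delta.
    """
    n = len(scores)
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - delta))
    if k > n:
        # Too few calibration points for this coverage level:
        # only the trivial (unbounded) region is valid.
        return math.inf
    return sorted(scores)[k - 1]


# Assumed calibration errors, for illustration only.
calibration_errors = [0.12, 0.05, 0.30, 0.08, 0.21, 0.17, 0.09, 0.26, 0.14]
radius = conformal_radius(calibration_errors, delta=0.2)
print("state estimation region: ball of radius", radius, "around the estimate")
```

A ball of this radius around the perception-map output is one simple form a "valid state estimation region" can take; the paper's regions and their use inside the controller may differ.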


Data-Assisted Vision-Based Hybrid Control for Robust Stabilization with Obstacle Avoidance via Learning of Perception Maps

Murillo-Gonzalez, Alejandro, Poveda, Jorge I.

arXiv.org Artificial Intelligence

We study the problem of target stabilization with robust obstacle avoidance in robots and vehicles that have access only to vision-based sensors for the purpose of real-time localization. This problem is particularly challenging due to the topological obstructions induced by the obstacle, which preclude the existence of smooth feedback controllers able to achieve simultaneous stabilization and robust obstacle avoidance. To overcome this issue, we develop a vision-based hybrid controller that switches between two different feedback laws depending on the current position of the vehicle using a hysteresis mechanism and a data-assisted supervisor. The main innovation of the paper is the incorporation of suitable perception maps into the hybrid controller. These maps can be learned from data obtained from cameras in the vehicles and trained via convolutional neural networks (CNNs). Under suitable assumptions on this perception map, we establish theoretical guarantees for the trajectories of the vehicle in terms of convergence and obstacle avoidance. Moreover, the proposed vision-based hybrid controller is numerically tested under different scenarios, including noisy data, sensors with failures, and cameras with occlusions.
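
The hysteresis-based switching described in this abstract can be pictured with a deliberately simplified sketch. Everything here is an assumption for illustration: a scalar position estimate `y`, the thresholds `lo` and `hi`, and the function names are hypothetical, and the paper's supervisor and feedback laws are not reproduced.

```python
def next_mode(mode, y, lo=-0.5, hi=0.5):
    """Hysteresis switch between two feedback laws (hypothetical sketch).

    mode: currently active law, 1 or 2.
    y:    scalar position estimate, e.g. from a learned perception map.
    The band (lo, hi) is an overlap region where the current mode is
    kept, so small estimation errors near a boundary do not cause
    rapid back-and-forth switching.
    """
    if mode == 1 and y >= hi:
        return 2
    if mode == 2 and y <= lo:
        return 1
    return mode


def run_supervisor(estimates, mode=1):
    """Apply the switch along a sequence of estimates; return the modes."""
    modes = []
    for y in estimates:
        mode = next_mode(mode, y)
        modes.append(mode)
    return modes


print(run_supervisor([0.0, 0.4, 0.6, 0.4, -0.4, -0.6, 0.0]))
# prints [1, 1, 2, 2, 2, 1, 1]
```

The mode changes only when the estimate leaves the overlap band on the far side, which is the property that makes this kind of switching tolerant to bounded perception error.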


Time-Efficient Mars Exploration of Simultaneous Coverage and Charging with Multiple Drones

Chang, Yuan, Yan, Chao, Liu, Xingyu, Wang, Xiangke, Zhou, Han, Xiang, Xiaojia, Tang, Dengqing

arXiv.org Artificial Intelligence

This paper presents a time-efficient scheme for Mars exploration through the cooperation of multiple drones and a rover. To maximize effective coverage of the Mars surface in the long run, a comprehensive framework has been developed with joint consideration for limited energy, sensor model, communication range and safety radius, which we call TIME-SC2 (TIme-efficient Mars Exploration of Simultaneous Coverage and Charging). First, we propose a multi-drone coverage control algorithm by leveraging emerging deep reinforcement learning and design a novel information map to represent dynamic system states. Second, we propose a near-optimal charging scheduling algorithm to navigate each drone to an individual charging slot, and we prove that a feasible solution always exists. The attractiveness of this framework lies not only in its ability to maximize exploration efficiency, but also in its high autonomy, which greatly reduces the non-exploring time. Extensive simulations have been conducted to demonstrate the remarkable performance of TIME-SC2 in terms of time-efficiency, adaptivity and flexibility.


Certainty Equivalent Perception-Based Control

Dean, Sarah, Recht, Benjamin

arXiv.org Machine Learning

Machine learning provides a promising avenue for incorporating rich sensing modalities into autonomous systems. However, our coarse understanding of how ML systems fail limits the adoption of data-driven techniques in real-world applications. In particular, applications involving feedback require that errors do not accumulate and lead to instability. In this work, we propose and analyze a baseline method for incorporating a learning-enabled component into closed-loop control, providing bounds on the sample complexity of a reference tracking problem. Much recent work on developing guarantees for learning and control has focused on the case that dynamics are unknown [Dean et al., 2017, Simchowitz and Foster, 2020, Mania et al., 2020].


Robust Guarantees for Perception-Based Control

Dean, Sarah, Matni, Nikolai, Recht, Benjamin, Ye, Vickie

arXiv.org Machine Learning

Motivated by vision-based control of autonomous vehicles, we consider the problem of controlling a known linear dynamical system for which partial state information, such as vehicle position, can only be extracted from high-dimensional data, such as an image. Our approach is to learn a perception map from high-dimensional data to partial-state observation and its corresponding error profile, and then design a robust controller. We show that under suitable smoothness assumptions on the perception map and generative model relating state to high-dimensional data, an affine error model is sufficiently rich to capture all possible error profiles, and can further be learned via a robust regression problem. We then show how to integrate the learned perception map and error model into a novel robust control synthesis procedure, and prove that the resulting perception and control loop has favorable generalization properties. Finally, we illustrate the usefulness of our approach on a synthetic example and on the self-driving car simulation platform CARLA.